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LISA is the upcoming space-based Gravitational Wave telescope. LISA Pathfinder, to be launched 
in the coming years, will prove and verify the detection principle of the fundamental Doppler link 
of LISA on a flight hardware identical in design to that of LISA. LISA Pathfinder will collect a 
picture of all noise disturbances possibly affecting LISA, achieving the unprecedented pureness of 
geodesic motion necessary for the detection of gravitational waves. The first steps of both missions 
will crucially depend on a very precise calibration of the key system parameters. Moreover, robust 
parameters estimation is of fundamental importance in the correct assessment of the residual force 
noise, an essential part of the data processing for LISA. In this paper we present a maximum 
likelihood parameter estimation technique in time domain being devised for this calibration and show 
its proficiency on simulated data and validation through Monte Carlo realizations of independent 
noise runs. We discuss its robustness to non-standard scenarios possibly arising during the real-life 
mission, as well as its independence to the initial guess and non-gaussianities. Furthermore, we 
apply the same technique to data produced in mission-like fashion during operational exercises with 
a realistic simulator provided by ESA. 



I. INTRODUCTION 

LISA [TJ [2] is the proposed space-based Gravitational 
Waves (GWs) observatory planned to fly by the next 
decade. It is based on three Spacecrafts (SCs) — each 
hosting and protecting two Test Masses (TMs) in nom- 
inal free fall — flying in a 5 Million km sided trian- 
gular formation around the Sun at 1 AU. A total of 
6 TMs, whose displacements are detected by a laser- 
interfcrometric technique, constitute 6 Doppler links, two 
per LISA arm, tracking the local curvature variations 
around the Sun and being sensitive to the small fluctua- 
tions induced by GW signals in the 0.1 — 100 mHz band. 

One (any) arm of LISA is virtually shrunk 3J to 38 cm 
and implemented in the LISA Pathfinder (LPF) mis- 
sion [IHB]. LPF will effectively measure the differential 
force noise that pollutes the sensitivity of LISA below 
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3x 10~ 14 ms" 2 Hz" 1 / 2 around 1 mHz — the minimum per- 
formance level for LISA to carry on its science program 
in astrophysics. 

The observational horizon of LISA will include thou- 
sands of GW sources. Among all, the highest Signal- 
to-Noise sources will be surely the Super-Massive Black 
Holes (SMBHs). However, there are sources that are at 
the limit of the LISA sensitivity for which an accurate 
assessment of the instrumental noise is mandatory. The 
population of the Extreme Mass Ratio Inspirals (EMRIs) 
[7] is the most important example: they are a valuable 
instrument to test general relativity and curvature in the 
strong gravity regime. Different EMRI search methods 
have been developed. After having subtracted the high- 
est signals (SMBHs and calibration binaries), in order to 
extract the EMRI signatures, all methods strictly have to 
deal with the instrumental noise level, for which the LPF 
mission has a crucial role. In fact, a systematic error in 
the reconstructed noise shape would dramatically affect 
the identification of such sources. The methods described 
in this paper allows for a solution of this problem. 
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The main pay load on-board LPF, the LISA Technol- 
ogy Package (LTP) [5], will thus be used in an extensive 
characterization campaign by measuring all force distur- 
bances and systematics. To the purpose, a precise cali- 
bration of the key system parameters must be performed 
before any assessment of the final level of differential force 
noise can be made. The full process is iterative: the qual- 
ity of free fall achieved at a given stage of the mission 
depends on the results of the previous experiments. By 
proceeding in the direction of increasing precision, the 
observed noise will be fully explained. 

In LPF as a physical system the relationship between 
sensed displacements and applied station-keeping forces 
plays a crucial role. Hence, the effect of such forces must 
be taken into account and subtracted from the data, in 
order to provide a successful estimate of the external 
residual force noise. To this end and to invert the system 
dynamics the calibration of all key system parameters is 
required, a problem we address and solve by maximum 
likelihood parameter estimation in time domain. Prelim- 
inary work was presented in [8], we hereby extend the 
method, present it in a more robust fashion and apply it 
to simulated datasets as well as to more realistic simula- 
tion data released by ESA. 

The paper is structured as follows. In Section [TT] we 
provide a general description of a multi-controlled dy- 
namical system and show the procedure to obtain the 
external residual out-of-loop force noise. Then, we apply 
the formalism to the LPF mission and provide a model 
for the dynamics along the two most relevant degrees of 
freedom. In Section ITO1 we demonstrate how the method 
is capable of correctly identifying all parameters and han- 
dling the problem of degeneracy by collecting the infor- 
mation from different experiments aimed at exciting dif- 
ferent degrees of freedom of the system. In Section [TV] we 
report on the results of our investigations. More in de- 
tails, in Section |IV A| we discuss the data production, in 
|IV B| the whitening filters and finally in |IV C| the param- 
eter estimation. Section IV C 1 presents the Monte Carlo 



validation of the method; IV C 2 and IV C 3 describe 



the proficiency on applying the same technique to non- 
standard scenarios, corresponding to a poorly calibrated 
(or strongly under-performing) system (robustness to the 
initial guess) and a readout affected by glitches (robust- 
ness to non-gaussianities) . Section IV D| provides an ex- 
ample of analysis of data produced by the ESA LPF sci- 
ence simulator, more realistic but treated in part as a 
black-box. Finally, in Section [IV E| we discuss the overall 
impact of the method to the estimation of the residual 
force noise. 



II. DYNAMICS AND SYSTEM 
IDENTIFICATION 

The LISA link is ideally composed of two TMs working 
as mirrors and whose relative displacement is tracked by 
a laser interferometer. While controlled along the other 



degrees of freedom, the TMs are left in nominal free- 
fall along the optically sensed axis. In LISA there is 
actually no direct measurement of the differential TM 
displacement, but a combination of local measurements 
(TM to local optical bench) and the one between two far 
apart SCs. 

LPF holds two main conceptual differences with re- 
spect to LISA: the differential motion is directly detected 
by a laser interferometer and the second TM is con- 
trolled along the sensitive axis. Indeed, in the main sci- 
ence mode, LPF is a controlled dynamical system where 
the reference TM is in free fall along the sensitive axis 
and electrostatically suspended along the other degrees 
of freedom. Interferometric readouts are used by the con- 
troller to compute specific commands sent to the actu- 
ators to drive the SC and the second TM to follow the 
reference TM. The control of the separation of the SC to 
the reference TM is called drag-free loop; the control of 
the separation of the second TM and the reference TM is 
called electrostatic suspension loop. In this way, LPF is 
actually a dynamical system with coupled control loops. 

The full equation of motion expressed in the interfer- 
ometer sensing coordinates (o) can be written as (see 
Appendix [A] for details) 



D S" 1 



f + C • T ■ O; 



(1) 



where D contains the derivatives, C is the control ma- 
trix, T maps the delays and S the sensing strategy. Oi 
are setpoint injections in the interferometer channels and 
o n is the readout noise vector, f represent the external 
forces. We have also defined the second-order differenti- 
ation operator 



D S 



(2) 



Two transfer function matrices can be naturally identi- 
fied 



H 



Oi— S-O 
Ho^f 



A 1 C T 
A . 



(3) 
(4) 



In particular, the second is of fundamental relevance as 
it shows that the differentiation operator allows to es- 
timate the out-of-loop external residual force per unit 
mass. However, such evaluation requires to handle the 
effect of the controller and calibrate the parameters con- 
tained in the matrices D, S and C. Hence, the first trans- 
fer function is used for the system identification or, equiv- 
alently, for the estimation of all system parameters in 
dedicated experiments; the second to estimate the force 
per unit mass by applying the calibrated operator A on 
data with interferometer noise only (all deterministic in- 
puts are set to zero). 

Considering the model along the optically-sensed axis 
described in Eq. and referring to Eq. ( A4 ) for the no- 
tation, the computation of the residual force noise re- 
quires the calibration of the following minimal set of pa- 
rameters: 
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o io\ and w 2 2 : residual oscillator- like couplings be- 
tween the SC and the reference TM and between 
the two TMs, the first one typically ~1 x 10" 6 s" 2 ; 

o S21: the sensing cross-talk between o\ and 012 
channels, typically ~lxlO" 4 ; 

Adf and A sus : actuation gains for application of the 
forces by the thrusters and the electrostatic suspen- 
sions, typically ~1; 

o Ati and At 2 : delays in the application of the ac- 
tuation of the forces by the thrusters and the elec- 
trostatic suspensions, typically some fraction of a 
second. 

Amongst the series of experiments characterizing the 
LTP, a few of capital importance will tackle the measure- 
ment of the mentioned parameters. We hereby consider 
two main identification experiments where the injections 
are signals modifying the interferometer zero point: 

1. injection into the controller guidance of the 0\ chan- 
nel, namely o; 1; 

2. injection into the controller guidance of the 012 
channel, namely 0^12. 

Clearly the o\ readout of the first experiment is highly 
sensitive to w 2 , A^i and At%; the 012 readout is sensitive 
to almost all parameters, in particular to S21, A sus and 
Ati. However, in the second experiment the 0\ readout 
serves as a sanity check since it is expected to carry no 
information (the cross-talk from the differential channel 
to the first one is negligible). The combination of the 
information from the two available experiments makes 
the identification of all 7 parameters feasible. 



III. MAXIMUM LIKELIHOOD ESTIMATION IN 
TIME DOMAIN 

The colored noise shapes expected in the LPF mission 
force us to develop a rather general formalism to identify 
the system parameters. 

Let us suppose that our measurement is stored in the 
time series (ti,Xi) with i = 1, iVdata> and the model 
to fit is x(ti,p), p being the vector of all parameters. 
We can define the residuals between data and model as 
Ti = Xi — x(ti,p). The general recipe is to build the 
likelihood estimator 



£(data|p) = const e 



X 2 (p)/2 



(5) 



where the argument is the norm of residuals (log- 
likelihood) 



X 2 (p)Hr(p)|r(p)) , 



(6) 



easily identifiable with the usual least square estimator 
when the noise is uncorrelated and Gaussian distributed. 



The inner product (-|-) that defines the norm in the pre- 
ceding equation for discrete values is given by 



(g\h) = g t • c; 1 • h 



(7) 



where g and h denote two generic functions evaluated at 
discrete times and C n is the noise covariance matrix. 

Therefore, the parameter estimation task for LPF con- 
sists of either maximizing the likelihood in Eq. ^ or min- 
imizing the norm (log-likelihood) in Eq. ([6]). 



A. Multi-experiment analysis 

As described in Section [TTJ the simplest system iden- 
tification consists of at least two experiments (with two 
interferometer readings each) to be performed in-flight. 
Therefore, it is necessary to develop a general estimation 
method that includes the information coming from many 
experiments and measurement channels. This can also 
solve the issue of the possible parameter degeneracy by 
increasing the total information of the system. 

The first intuitive attempt is to fit each experiment 
(or even each single channel) independently, obtain the 
various parameter estimates and combine them to get 
the final result. The underlying philosophy is to accu- 
mulate information about the full parameter set. Let us 
recall the definition of the Fisher information matrix on 
the best-fit estimates defined as the Hessian of the log- 
likelihood 



if 



kl 



dpkdp. 



x 2 (p) 



(8) 



Let us also suppose that py are the parameter estimates, 
for i and j counting the experiments and the readings 
per each experiment and Xij is the relative Fisher infor- 
mation matrix. It can be shown [5] that the combined 
estimate is an information-weighted average 



iV cxpa N che 

p = t 1 ■ £ x y • p« ' 



(9) 



=1 3=1 



where the total Fisher information matrix is 



iVexps JVcha 

t=l J=l 



(10) 



In terms of a covariance matrix, the preceding formulae 
are a generalization of the covariance-weighted mean. 

Following the principle of Ockham's razor, one usually 
wants to fit the smallest set of parameters. In Section |ll| 
we concluded our discussion saying that there are some 
readouts that are more sensitive to some parameters than 
others. Those others play the role of nuisance parameters 
for the fit of that readout. Hence it is a good practice to 
fit only a restricted number of parameters for each read- 
out, those that are more meaningful for it. Clearly, all 



4 



independent fits provide different estimates to different 
parameters, yet, some parameters may be shared between 
experiments. When summing up within Eq. ([9| we take 
care of the different matrix sizes by putting zeros where 
we have no measurement or information. 

An alternative method for the multi-experiment pa- 
rameter estimation is based on building a joint likeli- 
hood of all experimental outputs and models for each 
experiment. Since the typical identification experiment 
on flight will not last enough to let the cross correlation 
become important between different channels (and surely 
between different experiments), the hypothesis of statis- 
tical independency is reasonable and the joint likelihood 
is given by 



N,.- 



£(data|p) = J| dj (data »j |p) 



(11) 



where i counts the experiments and j the channels per 
experiments. Analogously, the joint log-likelihood norm 



is 



x 2 (p) = E E4(p) 

<=1 3=1 



(12) 



Assuming that all channels are sampled at the same rate 
and last for the same duration, the overall number v of 
degrees of freedom are defined as v = A oxps x A c h s x 
Adata — AT p , where A p is the dimension of the parameter 
space. 

Our analysis has shown that on simulated data the 
joint fit and the combination of independent fits provide 
compatible estimates within the confidence level. How- 
ever, the inaccuracy of a model for a channel might bias 
the fit. When this occurs, the information weighed mean 
of Eq. ([9| is not robust and it amplifies the bias in the 
combined estimate. In this case, one can try to remove 
that estimate and combine the remaining: however, by 
doing so information and precision would definitely be 
lost. The joint analysis is more robust to such kind of 
problems: the poor information from the badly-fitted 
model is compensated by the others. We will therefore 
adopt the joint approach for the rest of the paper. 



B. Whitening 

Let us consider, for simplicity, the case of one experi- 
ment with only one reading and assume that stationarity 
holds true. (In general, one needs to consider also the 
cross correlation between different channels and experi- 
ments.) Eq. (6]) can be rewritten in term of the self-inner 
product of the residual vector. Then, without loss of 
generality, there exists an orthogonal matrix U and a 



diagonal matrix A n such that 
X 2 (p) = rt(p).C- n 1 .r(p) 



rt(p). (W-A^-Wn -r(p 



(13) 



By diagonalizing the noise covariance matrix, data get 
decorrelated and this is equivalent to whitening the data 
in the frequency domain. The likelihood estimation shall 
now be performed on the transformed residuals, obtained 
by the application of the operator 11^ (the whitening fil- 
ter), on the residual vector r(p), along the eigen-direction 
of the noise. 

The same formalism can be applied to the calculation 
of noise generating functions |10j . 



C. The estimation method 

The parameter estimation templates, to be compared 
to the experimental data, are calculated from the avail- 
able matrix H(u>,p) of transfer function models. For an 
experiment having N inputs, contained in the vector i, 
and M outputs, contained in the vector o, the matrix has 
size N x M. The modeled (" symbol) system outputs, are 
given by 

6(t,p)=J- 1 [H( W ,p).i( W )](t), (14) 

where J 7 " 1 [•](£) stands for the standard inverse Fourier 
transform. Here we assume that all parameter informa- 
tion is contained within the transfer function matrix and 
the inputs are not parametric. Indeed, we want to study 
the system and the injections are usually fully known. 

In order to compute the Fisher information matrix, one 
also needs the first-order derivatives of the models. This 
is obtained by the following 



V p 6(t,p)=^ 1 [VpH(a;,p)-iH] (t) 



(15) 



where V p is the vector of derivatives with respect to each 
component of p. This quantity is usually called the model 
Jacobian. The whitened Jacobian multiplied by its trans- 
pose gives the information matrix. 

Therefore, it's easy to identify the iteration steps 
needed to produce the final parameter estimates, in loops 
of increasing accuracy: 

1. the whitening filters are estimated from a long noise 
run; 

2. data are whitened; 

3. the templates are generated from the transfer func- 
tions matrix (see Eq. (14)); 



4. the templates are whitened; 

5. data are fitted by adjusting the parameters itera- 
tively and generating new whitened templates (step 
3 through 4). 



5 



IV. DATA ANALYSIS 

In this section we want to validate the method in- 
troduced above by applying it to mock data. Firstly, 
we describe the data generation: a long noise run and 
some injection experiments are simulated. Then, we 
show the whitening/decorrelation process: here whiten- 
ing filters are provided for the estimation. Thereafter, 
we go through the core step of the parameter estimation 
where we describe the optimization scheme and show 
some results. We prove its consistency with a Monte 
Carlo simulation and its robustness to altered bound- 
ary conditions: namely the initial guess (corresponding 
to an under-performing system configuration), and non- 
gaussianities (the presence of glitches in the readout). 
Then, we will show the results of the application of the 
same methodology to data produced by a realistic LPF 
simulator provided by ESA. Finally, we discuss the rel- 
evance of these methods in correctly computing the ex- 
ternal residual force noise per unit mass. 

Data production and analysis are performed with the 
LISA Technology Package Data Analysis Toolbox (LT- 
PDA) pi], an object-oriented extension of MATLAB® 



A. Experiments and data generation 

A long noise run lasting 28 hours has been produced by 
coloring white Gaussian noise with realistic noise shaping 
filters [10] and assuming stationarity and normal distri- 
bution. During the real mission, this long noise run has 
two uses: to investigate the noise behavior, and to pro- 
duce whitening filters for the parameter estimation. 

It is worth making some comments on the assumption 
about the noise stationarity. For LPF, noise is a stochas- 
tic process that we can model as a function of time and 
some system parameters, say n = n(t,p(t)) in the case 
of a single parameter p. A fluctuation Sp around the 
nominal value po due to a time-dependency gives — to 
first order — n ~ uq + n'Sp, where no = n(t,po) and 
n' = dn(t,p)/dp\ . Then, for a zero-mean process, the 



from the previous long noise run, produced determinis- 
tic signals following the recipe of Eq. ( 14 1 and assumed 



ipo . 

total noise variance is 

Var[n] ~ Var[no] + Var'[no]<5p + Var[n']<5p 2 



(16) 



where the linear and quadratic terms come from the co- 
variance between no and n' and the variance of n! itself 
(see Appendix|B]for details). Therefore, if any of the sys- 
tem parameters changes in time, the noise is likely to be- 
come non-stationary. The converse (i.e., non-stationarity 
implies a variation of the parameters) is not true, since 
other effects, independent from those parameters, may 
still be relevant. For example (Section IV C 3), glitches 
are a non-stationary noise behavior intrinsic in the read- 
out. 

We simulated both experiments described at the end 
of Sect ion QTJ for a total duration of 3 hours each. We im- 
plemented a different and independent noise realization 



that the superposition principle of signals and noise holds 
true in the hypothesis of small motion and in absence of 
non-linearities. 

An example of the two generated experiments with 
the respective injections is shown in Fig.[TJ A logarith- 
mic sweep of increasing frequency / = 8.3 x 10" 4 , 1.7 x 
10" 3 , 3.3xl0- 3 , 6.7xl0- 3 , 1.3xl0" 2 , 2.7xl0" 2 , 5.3xlO" 2 Hz 
(from 1 up to 64 cycles, each single sine stretch lasting 
1200 s), is injected into the system, once in the first guid- 
ance Oj.i, then in the second o;.i2, to produce the two 
experiments. The amplitudes are selected not to exceed 
1% of the operating range of the instrument and 10% of 
the maximum allowed forces. In the first experiment the 
system response in the o\ channel resembles the origi- 
nal injection Oi j, except at high frequency where there is 
a slight difference. At higher frequencies the system re- 
sponse increases due to the rising of the transfer function, 
therefore the amplitudes are lower than the correspond- 
ing ones at lower frequencies. Instead, the 012 channel 
contains an extra contribution due to the readout cross- 
talk. In the second experiment the system responds only 
in the on channel and the o\ signal content is negligible. 
Since the system response decays at higher frequencies 
we must provide injection signals with higher amplitude 
to see any effect. A phase delay is also present. 

In what follows, data have been produced at 10 Hz and 
then (since the highest frequency injected signal is around 
50 mHz) down-sampled to 1 Hz to ease data processing. 
During the mission, data will be collected at a sample 
rate between 1 and 10 Hz, depending on the experiment 
and available down-link bandwidth. 



B. Whitening niters 

The first step of the data analysis pipeline is the es- 
timation of the whitening filters from the 28-hour noise 
run. For uncorrelated channels (in the parameter esti- 
mation experiments cross-correlation can be effectively 
ignored) whitening filters are derived following the prin- 
ciple of Section |IIIB| a fit in the z domain to the inverse 
of the noise spectrum is performed. In Fig. [2] the result- 
ing effect of the whitening filters is to flatten the original 
noise curves. For each spectrum we report the estimated 
Power Spectral Density (PSD). 

Other details are summarized in Table UJ All statisti- 
cal moments estimated on the whitened time-series are 
compatible with the ones expected according to Gaus- 
sian noise, except for the mean of the differential channel 
which shows a significant departure from the zero-mean 
value. This is due to an intrinsic limitation of the whiten- 
ing process at low frequency. 
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FIG. 1. Synthetic data generated for Exp. 1, injection into the 
first controller guidance oi i, and Exp. 2, injection into the second 
controller guidance o; 12- We show the response of the system in 
both channels 01 and 012 for the two experiments, expected to not 
exceed 1 (im. We split the data to highlight the part concerning 
the parameter estimation. In Exp. 1 the response of the system 
in 01 (dashed line) is approximately equal to On, except at high 
frequency, and a residual signal in 012 is due to the cross-talk. In 
Exp. 2 the response of the system in 01 is negligible and in 012 is 
important at low frequency. 
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FIG. 2. Effect of the whitening filters on a simulated noise run. 
01 and 012 are the original noise stretches. oi w and oi2, w are the 
(flattened) whitened noise stretches. PSDs are computed with the 
Welch overlap method, 4-sample 92-dB Blackman-Harris window 
|13| . 16 averages and mean-detrending. 



Parameter estimation 



TABLE I. Sample mean /1 and standard deviation a with higher 
moments, the sample skewness 71 and the excess kurtosis 72 for 
the whitened channels o\ and 012. Assuming Gaussian-distributed 
data, the approximate standard deviations are <t m ~ u/\/N, <t ct ~ 
a/\/2N, a 11 ~ y/6/N, cr 72 ~ y/24/N, with N the number of 
samples. 



/' 



Ti 



72 



01 0.008 ±0.003 
012 -0.254 ±0.003 



0.970 ±0.002 (-5±8)xl0" a (0±2)xl0" 2 
1.002 ±0.002 (0±8)xl0~ 3 (3±2)xl0~ 2 



the information and remove the degeneracy. Moreover, 
all models are treated non-linearly. The matrix of the 
transfer function models, the inputs and outputs for each 
experiment, together with the whitening filters for all 
channels are passed to the fitting algorithm. See Fig. [3] 
for an idealized and simplified scheme that shows how the 
data and models are used by the parameter estimation 
algorithm. 
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FIG. 3. A simplified schematic of the parameter estimation algo- 
rithm. After data production, the long noise run is used to estimate 
the whitening filters. Then, all injections and system responses are 
collected and passed to the \ 2 fitting algorithm. The modeling pro- 
vides the matrix of transfer functions. By adjusting the parameter 
the x 2 is optimized to get the best-fit parameters. 

The optimization approach is mixed: a preliminary 
search with the preconditioned conjugate gradient al- 
gorithm (alternatively, the Broyden-Fletcher-Goldfarb- 
Shanno quasi-Newton method can be used) [H], using 
analytical derivatives, is followed by a derivative-free sim- 
plex algorithm. The main advantage is that a global 
minimizer with lower precision is accompanied by a local 
minimizer with higher precision and this overcomes the 
intrinsic difficulties connected with non-linear optimiza- 
tions. 



In what follows we implement the ideas of Section |III| 
on simulated data. The objective function is the log- 
likelihood (x 2 statistic in least-squares sense) which is 
optimized with respect to the parameter vector. The 
method uses all channels and experiments to maximize 



1. Monte Carlo validation 

A Monte Carlo simulation with 1000 different noise 
realizations was used to check for consistency of the 
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method. The estimation was identically repeated at each 
step, enabling fine tuning and the study of the statistics 
for every relevant physical quantity. In Table |TT] we com- 
pare the mean best-fit to the true values for all param- 
eters: the accordance is at the level of 1 or 2 standard 
deviations. We also show the sample standard deviation 
of the best-fits and the mean estimated standard devi- 
ation: the first one is the fluctuation of the parameters 
due to the noise, the second is the average fit error. 



TABLE II. Monte Carlo validation of 1000 independent noise re- 
alizations. The mean best-fits are compatible with the real values. 
The term in brackets is the error relative to the rightmost digit. 
The mean standard deviation (estimated from the fit) is of the 
same order of magnitude of the sample standard deviation of the 
best-fits. The mean \ 2 = 0.96, v = 79993. 
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4xl0" 4 


0.9999001(1) 


4xl0" 6 


2xl0" 5 


0.90004(9) 


3xl0~ 3 


4xl0" 3 


■1.303006(7) 


2xl0~ 4 


lxlO" 3 


■0.697998(6) 


2xl0" 4 


5xl0" 4 


0.059995(3) 


9x1 0" 5 


3xl0" 4 


0.05000(3) 


8x1 0" 4 


lxlO" 3 



A more in-depth analysis concerns on the parameter 
statistics reported in Fig. [4] The accordance of the sam- 
ple statistics with the theoretical Gaussian Probability 
Density Function (PDF) (evaluated at the sample mean 
and standard deviation) is self-evident. 

As for all parameters, in Fig. [5] we show the statis- 
tics for the estimated variances. Theory prescribes that 
the variance statistics should be x 2 distributed, but for 
v = 79993 the \ 2 distribution tends to a Gaussian dis- 
tribution. From the plot it is clear that they are all in 
agreement with the theoretical Gaussian PDF. 

Surprisingly, the correlations are also Gaussian dis- 
tributed with good approximation. See Fig. [6] for two 
examples. 

The correlation between two parameters is linked to 
the rotation of the local x 2 paraboloid around the min- 
imum. To support this statement, we show some plots 
of the projection of the 7-dimensional surface onto two 
parameters at a time, around the best-fit. See Fig. [7] for 
some examples. Weakly correlated parameters, like S21 
and u) 2 (~ 20%) (panel (b)) typically have the princi- 
pal axes of the contour curves aligned with the x and 
y axes. Highly correlated parameters, like ^4 SUS and ui 2 
(~ -70%) (panel (a)) have the principal axes rotated by 
a non- negligible amount. 

A whole Monte Carlo history of the \ 2 log-likelihood 
chains is recorded in Fig. [8] The scatter of the chains 
is due to the noise fluctuation among the Monte Carlo 
iterations. There are clearly some chains that are far 
away from the accumulation zone: this behavior is com- 
pletely unexpected as one would think the noise to have 
little impact on the chains location. Despite the big scat- 



ter, the asymptotic distribution is Gaussian (see Fig.|9|. 
The final, and most remarkable check, is the comparison 
between the fit \ 2 log-likelihood and the one calculated 
on pure noise data. It is important here to stress that 
both the fit and the noise x 2 at a nrs t showed agree- 
ment between each other, but they were both positively 
skewed. The following facts explain why. In Section [IVB] 
we have described the practical method to implement the 
diagonalization of the noise covariance matrix of Section 
|III B| with its limitation. This consists in the impossi- 
bility of filtering out the lowest frequencies, due to the 
finiteness of the data stretches from which whitening fil- 
ters are derived and which causes the skewness. Trans- 
parently, the application of a high-pass filter to the data 
has solved the issue. Finally, the plot provides an impor- 
tant twofold test: on a side, the parameter variances are 
statistically distributed as the fit \ 2 log-likelihood; on 
the other, the fit % 2 log-likelihood is in agreement with 
the noise \ 2 log-likelihood, showing that the estimation 
method has suppressed the systematics and estimated 
the noise statistics with no extra bias. 



2. Robustness to the initial guess 

We discuss here the robustness of the estimator to the 
choice of the initial guess. For this, we start the search 
from an initial guess for the parameters arbitrarily far 
away. Physically this corresponds to a very poor knowl- 
edge of the system or to the unlikely (but possible) situ- 
ation of under-performing actuators and underestimated 
couplings between the TMs and the SC. In both cases a 
precise calibration is needed. 

In Table [IlTl we summarize the results by comparing the 
real values, the initial guess and best-fit. We also indicate 
the bias (absolute deviation from the real value in units 
of standard deviation) for both the best-fit and the initial 
guess. We notice that almost all parameters are within 1 
standard deviation from the real ones, even though the 
starting guesses are typically at ^10 3 standard deviations 
away. 



TABLE III. Robustness to the initial guess. Initial guess at x 2 = 
1.3 X 10 5 , v = 79193; best-fit at x 2 = 0.99. The term in brackets 
is the error relative to the rightmost digit. In curly brackets the 
bias (absolute deviation from the real value in units of standard 
deviation) for each estimate is also shown. 



Real 



Best-fit 



Guess 



521 [10- 3 ] 
uj\ [10" 6 s- 2 ] 

"12 [10" 6 S- 2 ] 

Ati [s] 

Ati2 [s] 



0.62 0.61994(8) {0.77} 

0.6 0.599990(8) {1.3} 

-1.5 -1.4998(1) {0.55} 

-3 -2.9998(2) {1.1} 

-2 -2.0000(1) {0.32} 

0.6 0.6013(7) {1.8} 

0.4 0.398(2) {0.95} 



1 {4.9xl0 3 } 

1 {5.1xl0 4 } 

{4.7xl0 3 } 

-1.3 {7.8xl0 3 } 

-0.7 {l.OxlO 4 } 

{8.4xl0 2 } 

{2.3xl0 2 } 



In Fig. [TO] we show the performance of the estimation, 
showing how the method is able to suppress the \ 2 by 





FIG. 5. Monte Carlo statistics for each parameter variance (a)-(g). The scaled Gaussian PDF is evaluated at the sample mean and 
standard deviation. 



many orders of magnitude, from 1 x 10 5 to ~1, which is 
the required optimum, within the given tolerances (set to 
lxlO" 4 in both optimization function and variables). Fi- 
nally, it is clear that we can recover the real values within 
the confidence level by decreasing the merit function of 
many orders of magnitude, while keeping both accuracy 
and precision. 

Analogously, in Fig.[TT] two examples of estimation 
chains (for ui\ and lo\ 2 ) are also reported, showing the 
correlation with the big jumps of the \ 2 chains and how 
the parameters saturate to the optima. 

In the end, the analysis of residuals summarized in 
Fig. [T2|demonstrates that it is possible to completely sub- 
tract the deterministic part out of the data and reach the 
expected noise shapes (estimated from the independent 
noise run) for all experiment and channels. The improve- 
ment is evident at low frequency: for the Oyi channel 
the residuals are suppressed by ~4 orders of magnitude 



around 1 mHz. The same happens for the oi channel in 
the first experiment where the improvement is of ^2 or- 
ders of magnitude. Only the o\ channel in the second 
experiment contains no signal since the cross-talk from 
the O12 into that channel is negligible. 



3. Robustness to non-gaussianities 

This section is devoted to showing the effect of non- 
gaussianities in the noise, and how to properly handle 
them. The main realistic behavior of experimental noise 
is the presence of outliers: consequently the sampling dis- 
tribution of the data may show some prominent tails. An 
example of such outliers is the manifestation of glitches, 
very short noise transients due to anomalous response in 
the readout / circuitry. 

A standard approach, called local L-estimate [14] . is to 
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FIG. 6. Monte Carlo statistics for two parameter correlations. 
The scaled Gaussian PDF is evaluated at the sample mean and 
standard deviation. 



generalize the norm of residuals in Eq. Q , since it usu- 
ally overweighs the outliers. The idea is to properly take 
care of them by regularizing the usual square of residuals 
with other definitions of the norm. As an example, three 
possible definitions are considered 



(r(p)|r(p)> 



mean squared dev. 
mean absolute dev. 
mean logarithmic dev. 



(17) 



corresponding to the Gaussian, log-normal and 
Lorentzian distribution, respectively. The subscript 
i counts the data and the channels/experiments. 

The method can be successfully applied to data with 
glitches. Noise glitches are unpredictable high frequency 
noise transients mostly due to failures in the circuitry. 
Such outliers usually fall well beyond 3 standard devia- 
tions and produce an excess at the tails of the statistic. 
Since the output of the interferometer might be subject 
to similar phenomena, we simulate a realistic experiment 
containing glitches and model such transients as Sine- 
Gaussian (SG) functions 



«gl(*) =asin(27r/ (i-io))exp 



(18) 



where the SG parameters span a wide (uniformly dis- 
tributed) range of values. In particular, the SG fre- 
quency, /o, covers the whole bandwidth (10" 4 — 0.45) Hz; 



the injection time, to, is distributed all along the whole 
time-series; the characteristic time, r, which gives the 
typical duration of the pulse is (1 — 2) s; the amplitude, 
a, is defined in terms of the number of the original noise 
standard deviations and falls outside the Gaussian statis- 
tic by (3 — 20) <r n . Moreover, we fix the number of SG 
injections as a fractional part of the whole data series: 
we conventionally choose /sg = A^G/A^data = 1%, since 
higher values are very unlikely. Notice that this value rep- 
resents only the number of injections: the actual fraction 
of corrupted data will be of the order of 3E[t]/ sg ~ 5%. 

Glitchy noise is readily produced by coloring a white, 
zero-mean, unitary standard deviation input time-series, 
corrupted by random injections of SGs. See Fig. [F3| for 
an example of injection of glitches into the differential 
channel. The simulated noise is now what the experiment 
would give us during the real-life mission. 

The effect of these glitches is that the PSD of the 
simulated noise scales linearly with the frequency, up to 
4x lO^rnHz" 1 / 2 and 6 x 10" 11 m 2 Hz" 1 around 0.2 Hz for 
the first and differential channel, respectively. This ex- 
cess noise sums up to the original one and affects its high 
frequency components (see Fig. 14). Obviously, the new 
statistic contains an excess at the tails. For example, the 
o\ channel has a kurtosis of 19, compared to the original 
one of -9x 10" 3 . No significant difference in skewness is 
detected since the statistic does not loose symmetry after 
the SG injections. 

In order to perform a realistic parameter estima- 
tion, whitening filters are derived from the glitchy noise 
stretches with the same procedure described in Section 
|IVB| However, since the whitening works only in the sta- 
tionary regime and the glitches are highly non-stationary, 
it is practically impossible to filter them out of the data. 

Table [TV] shows the results of the parameter estimation 
using the three norm definitions introduced above. 

The most conservative least square estimator provides 
overestimated errors since they scale as ~\Zx^. The ab- 
solute and logarithmic deviation provide better statistics 
and lower errors, but the first gives biased estimates of 
A sus , Ati and Ai 12 and the second a slightly biased esti- 
mate of S21. The analysis of residuals (not shown here) 
demonstrates that the methods are in agreement with 
each other and the noise shapes. From the performance 
point of view, these estimators are 30% and 9% faster 
than the Gaussian (mean squared deviation). As we can 
see, there is no absolute rule we can apply when dealing 
with glitches. However, from the difference between the 
methods we can infer the sensitivity of each single param- 
eter to glitches. For example, we can assume the ratio 
between the biases as the criterion for comparing two 
methods. The ratio tends to one if that parameter is not 
sensitive to glitches; otherwise, tends to a very small or 
big number. The comparison between the mean squared 
deviation and the mean logarithmic deviation gives that 
S21 is the most sensitive parameter, whilst Aii2 the least. 
Therefore, starting from the fact that the methods give 
the same result for purely Gaussian noise, our proposed 
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FIG. 7. Fit x 2 local curvature around the best-fit. The 7-dimensional surface has been projected onto two parameters at a time for some 
examples. The correlation is responsible for the rotation of the surface. 



TABLE IV. Robustness to glitches. We compare the norm for squared, absolute and logarithmic deviation with v = 79193. The term in 
brackets is the error relative to the rightmost digit. In curly brackets the bias for each estimate is also shown. 





Real 


Best-fit 


Best-fit 


Best-fit 


Guess 






(mean sq. 


dev.) 


(mean abs 


dev.) 


(mean log. 


dev.) 




Norm 




10 




2.1 




0.95 








1.01 


1.011(3) 


{0.29} 


1.010(1) 


{0.23} 


1.0109(8) 


{1.2} 


1 


•^SUS 


0.99 


0.99000(5) 


{0.035} 


0.98959(2) 


{20} 


0.99001(1) 


{0.99} 


1 


5 2 i [10- 4 ] 


1.1 


1.10(2) 


{0.074} 


1.113(7) 


{1.8} 


1.116(5) 


{3.4} 





Ji [io-V 2 ] 


-1.32 


-1.320(1) 


{0.061} 


-1.3188(6) 


{2.0} 


-1.3192(4) 


{2.0} 


-1.3 


[io-V 2 ] 


-0.68 


-0.6798(7) 


{0.29} 


-0.68000(3) 


{0.011} 


-0.6804(2) 


{1.8} 


-0.7 


Ah [s] 


0.1 


0.100(3) 


{0.045} 


0.090(1) 


{8.3} 


0.1007(8) 


{0.90} 





Ati2 [s] 


0.1 


0.098(5) 


{0.36} 


-0.0290(2) 


{58} 


0.098(2) 


{1.2} 






recipe is the following: 

1. apply the conservative approach (the ordinary 
mean squared deviation) directly to corrupted time 
series and try with different estimators (mean ab- 
solute deviation, mean logarithmic deviation, etc.); 

2. start to remove some outliers giving them negligible 
weight; 

3. redo the analysis with all estimators; 

4. check for convergence and agreement between the 
estimators. 

The whole process is actually a reweighing analysis that 
provides robust uncertainties and finally removes the out- 
liers in a step-by-step smooth readjustment. Even though 
it would be possible in principle to clean up the data just 
before the estimation, in that case the results would likely 



be dependent on the statistical criterion. The main ad- 
vantage of our recipe is its robustness and the fact that 
the data polishing is strictly step-by-step and model in- 
dependent. 



D. Analysis of operational exercises 

The estimation of parameters has been a pivotal sub- 
ject during many data analysis sessions in view of the 
LPF mission. The core experiments in dynamics are pur- 
posefully described in the LPF Experiment Master Plan 
and envision system identification methods of the kind 
described in Section QU 

With the aim of validating the data analysis effort 
and the conversion of the science strategies into tele- 
commands and on-board instructions (Payload Opera- 
tions Requests), several extended data analysis exercises 



11 



Flux tube for fit % chains 



Fit% chain 



2.4 
2.2 
2 
1.8 

1.4 
1.2 
1 

0.8, 







200 



400 600 
Iteration 



800 



1000 



FIG. 8. Monte Carlo fit \ 2 chains. The processes typically last for 
~1000 iterations and stop when either the function or the variable 
tolerance is below 1 X 10" 4 . 
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FIG. 10. Fit x 2 chain that shows the estimation performance from 
~1 X 10 5 to the optimum ~1. The process lasts for 1636 iterations 
and stops when either the function or the variable tolerance is below 
1 X 10" 4 . A preliminary global gradient search is followed by a local 
simplex. 
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FIG. 9. Statistics for \ 2 log-likelihood obtained from the fit and 
calculated on the noise alone. The agreement between the two 
shows that the deterministic part of the data is statistically sup- 
pressed. 



were called in the past 2 years. At first they took the 
form of Mock Data Challenges with two parties involved: 
data generation and data analysis. Rapidly the need of 
closing and testing the chain from science to payload de- 
manded the evolution of these data analysis sessions into 
real Operations Exercises with a few team leaders in co- 
location and the Principal Investigators' teams on call, 
to mimic the real mission time. 

In absence of the real LTP a simulator was used to 
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FIG. 11. Estimation chains for tJ\ and 

and simplex searches. The jumps are correlated to those of the \ 2 
The saturation to the best-fit is self-evident. 



generate the data. The Offline Simulation Environment 
(OSE, sometimes called Drag Free and Attitude Control 
System end-to-end simulator) provided by ASTRIUM 
|15) is a piece of software embedding the same dynamics 
and most of the software that will run on the on-board 
computer of LPF, with the advantage of being coded in 
a set of C/C++ modules called by a MATLAB® and 
Simulink® 16, engine, therefore easier to handle than the 
real operational machinery. The OSE contains detailed 
State-Space models of the most relevant disturbances and 
allows for their fine-tuning and (de)activation; it simu- 
lates the dynamics, the capacitive and thrusters dispatch- 
ing algorithms, handles possible couplings between differ- 
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FIG. 12. Analysis of residuals for all simulated experiments and channels. Initial and best-fit residuals are shown and compared to the 
expected noise shapes. The improvement on the 012 channel of both experiments (b) and (d) is of ~4 orders of magnitude around 1 mHz. 
For the 01 channel in the first experiment (a) 3 orders of magnitude; (c) contains no signal. PSDs are computed with the Welch overlap 
method, 4-sample 92-dB Blackman-Harris window, 16 averages and mean-detrending. 
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FIG. 13. Original and glitchy noise for the 012 channel. The level 
of data corruption is evident. 



ent degrees of freedom as well as the controller choices 
and decoupling strategy proper of each driving mode of 
LPF. The OSE is thus equivalent to an-silico LPF labora- 
tory, crucial for characterizing the behavior of the signals 
and telemetry on-board. 

The first phase of validation of the system identifica- 
tion experiments culminated with the sixth Operational 
Exercise, where the concepts explained in Sections |III| 
and IV insofar were applied to the OSE telemetry. The 



Exercise targeted the estimation of parameters using a 
linear fit with Singular Value Decomposition, a non- 
linear fit (insofar described) and a Markov-chain Monte- 
Carlo method. 

In practice, for the linear and non-linear fit the follow- 
ing recipe was followed: 
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FIG. 14. Spectra for simulated glitchy noise compared to the 
original for o\ and 012 channels. The high frequency bump around 
0.2 Hz is evident. PSDs are computed with the Welch overlap 
method, 4-sample 92-dB Blackman-Harris window, 16 averages and 
mean-detrending. 



1. identification of the time section where the signals 
had been injected; 

2. creation and training of whitening filters from the 
60000 s of noise data before the injection; 

3. whitening 

4. creation of numerical template for the injected sig- 
nals and the dynamics 

5. fitting of parameters 

6. subtraction of the so modelled signal from the data 
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7. estimation of the residual acceleration noise using 
the estimated parameters. 



The parameters (see Eq. (A4)) were not fitted directly 



for numerical stability issues: it is more efficient to fit the 
discrepancies with regard to the nominal values, guessed 
from common sense or obtained from ground-testing ex- 
periments. See Table |V| for a full list. Physical parame- 
ters can be recovered inverting the parameters expansions 
and the fit uncertainty can be propagated according to 
the uncertainty propagation theory, e.g. 

Var [A d{ ] =Var [6 An] 

Var [ul] = (1.9xlO" 6 ) 2 Var [Sujf] + (lxlO" 7 ) 2 Var [5u 2 12 ] 
+ 2 x (1.9 xlO" 6 ) x (lxlO" 7 )Cov [<5w 2 ,(5w 2 2 ] 

(19) 

The expected errors on parameters were estimated as 
optimal errors by inverting the Fisher information ma- 
trix: the error is therefore function of the input signals, 
the model used to describe the experiments and the noise. 

The techniques elucidated in Section |IVC| were em- 
ployed with a linear and non-linear fitting scheme. Dur- 
ing the Exercise a Bayesian method using Markov chains 
was used too, a detailed description of which is to be 
found in [5] . The agreement between the methods is very 
good and their comparison will be discussed elsewhere. 

As a matter of fact, the application of the whitening 
and fitting techniques to an operational scenario proved 
to be an excellent blind test in guessing the true values 
of the parameters in Table [V] The values obtained in 
the fit matched the true values within a few standard 
deviations but showed some extra systematics and over- 
shooting. The issue was highlighted by the analysis of 
residuals which showed the full matching of the noise, 
with the exception of a little mismatch at high frequency. 
The OSE is in fact a Multi-Input-Multi-Output simula- 
tor whose internal structures reflect a more realistic and 
cross-connected model of LPF, not necessarily embed- 
ding our reduced matrices (see Eq. (A4|) without extra 
non-diagonal or cross-controller contamination terms. In 
particular, the OSE engine dispatching forces and torques 
to the (virtual) LPF actuators is designed around a de- 
coupling strategy that — in spite of the name — truly 
couples the dynamics of several degrees of freedom to at- 
tain a cleaner and steadier readout in the 012 channel. 
More experiments and an enhanced model shall shine 
some light on the matter and improve our understand- 
ing. 



E. Estimation of residual force noise 



plan, the required level of differential force noise must be 
achieved. If we are pessimistic and assume our knowledge 
of the key system parameters is poor, or equivalently we 
have an under-performing system (as in Section IV C 2 1 , 
we show here that without a precise calibration of the 
system dramatic systematic errors might arise. 

In the analysis of operation exercises we have stressed 
the importance of parameter estimation in the evaluation 
of the residual force noise. 



The solution of the system equations ( Al )-( A3 ) allows 



for an exact computation of the out-of-loop residual force 
noise: in fact, by applying the operator of Eq. Q on 
noisy interferometric data the control and the all known 
forces can be isolated and subtracted. We have simulated 
the LPF noise by means of a parametric projection of the 
individual noise sources to the interferometric readout. 
By feeding the noise sum to the interferometer model, 
the readouts were simulated in turn. Such computation 
has been performed assuming the real parameters (true 
noise), the initial guess (without identification) and the 
best-fit (with identification). 

The result of such a computation on a very long noise 
run, ^6 days, is reported in Fig.[l5] We show there 
the PSD of the residual force noise per unit mass in the 
012 channel. As is clear from the plot, the best-fit curve 
reconstructs the true noise shape. The curves can be 
compared with a theoretical projection of the interfer- 
ometer noise model to the residual force noise, following 
the prescription 

Su,/(W,P) = H^ f (w,p) • S n , (u>,p t rue) ' H ^./(w,p) , 

where H _*/ (w, p) is the matrix of transfer function mod- 
els of Eq. Q and S ni0 (w, ptruc) is the cross-spectral den- 
sity matrix of the interferometer readout evaluated at the 
true parameter values. The plot shows the noise models 
obtained maintaining S n . G (u;, ptruc) fixed and evaluating 
H ^f(uj, p) at the initial guess (model without identifica- 
tion) and the true values (model with identification) . The 
agreement between the numerical and theoretical com- 
putation is good in the whole frequency band. It is also 
clear that there is a difference between the residual force 
noise assuming the initial parameter estimates and the 
true ones (or the best-fit which completely overlaps it). 
The improvement in the estimation of the residual force 
noise is a factor 4 around 0.4 mHz and a factor 2 around 
50 mHz. The biggest contribution to this difference is 
due to the under-performing actuators at high frequency 
and the force coupling between the SC and the TMs at 
low frequency. The conclusion is that without any kind 
of system identification the residual force noise would be 
overestimated, especially at low frequencies. 



This final part justifies the importance of the method 
proposed insofar. As said, the main scientific target of 
LPF is the characterization of the TM to TM laser link to 
allow for detection of GWs in LISA. In order to fulfill this 



V. CONCLUDING REMARKS 

This work has focused on the maximum likelihood pa- 
rameter estimation in time domain for the LPF mission. 
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TABLE V. Physical parameters for LPF and their expansion around nominal (or ground tested) values to obtain the parameters used for 
fitting. The true value was set in the OSE configuration, while the error was computed as the inverse of the Fisher information matrix. 
Initial values for the fit were set to for all fitting parameters. The rationale behind the choices is briefly described. 



Physical parameter and func- 


Irue value 


Jixpected 


Fit 


Initial 


Rationale 


tional expansion 


(simulator) 


error 
(Fisher) 


parameter 


guess 




Adi ~ 1 + oAat 


1 


4 x 10" 4 


£ A 





AblKIUM, controller specmcation. 


-^4-sus — 1 ~\~ ^-^sus 


1 


2x 10" 5 







AblKlUM, controller specmcation. 


Sn ~ 1 + SSu 


1 









Ground-testing measure: 0.0000(1). 
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Ground-testing measure: 0.000(1). 


S21 — SS21 
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3xl0" 7 
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FIG. 15. Total residual force noise (per unit mass) numerically estimated on synthetic data and compared to theoretical noise models 
obtained from a projection of fundamental noise sources. We show different results for numerical estimates and models corresponding to 
the assumption of the initial parameters (without identification), the best-fit (with identification) and the true parameters (true). The 
agreement between the models and the estimated noise is clear. The parameter estimation method described in this paper allows the full 
reconstruction of the true noise shapes and an improvement of a factor 4 at low frequency. PSDs are computed with the Welch overlap 
method, 4-sample 92-dB Blackman-Harris window, 16 averages and mean-detrending. 



After introducing the dynamical equations and a model 
for LPF, with its physical parameters and their signifi- 
cance, we have shown how to handle the effect of the con- 
troller, measure all known forces and subtract them from 
the data in order to provide an estimate of the residual 
force noise acting on the TMs. We have discussed our 
multi-experiment /multi-channel approach as a method 
to reach the desired measurement accuracy. We have 
started the discussion with a Monte Carlo simulation of 
different noise realizations, showing the statistical consis- 
tency of the method. Considering the physical situation 
where our knowledge of the system is not sufficient or 
the system is highly under-performing, we have tested 



the algorithm with respect to the choice of the initial 
guess and we have demonstrated that it has the ability 
to fully recover all parameters (within one standard de- 
viation) from a reasonable initial guess. Thereafter, to 
check the robustness to non-gaussianities — e.g., glitches 
in the interferometric readout — we have simulated cor- 
rupted time-series and repeated the analysis. The result 
is that we are still able to recover the parameters and 
identify those more sensitive to non-gaussianities. We 
have also proposed a method to handle corrupted data 
that consists of a step-by-step reweighing estimation. The 
same methodology was employed to analyze data pro- 
duced by a realistic LPF simulator in use at ESA: the 
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scope is to enter into a mission-like routine, operating a 
system where many parameters are unknown or handled 
by a dedicated computer. Finally, since the final scope of 
LPF is the characterization of the LISA Doppler link, we 
have proven that the proposed parameter estimation is 
mandatory to correctly assess the estimation of the dif- 
ferential residual force noise and avoid systematic errors 
at a level which would impact on the GW astronomy in 
the lower end of the LISA band. 



Appendix A: LPF dynamics in detail 

The controlled dynamics of LPF [51 [T7] is described by 
the following equations 

D q = g , (Al) 
g = f — C • (o — T • Oi) , (A2) 
o = S • q + o n . (A3) 

The total forces g produce the motion through the acting 
of the dynamics matrix D onto the physical coordinates 



q. These forces are decoupled into external forces f (con- 
taining stochastic and deterministic signals) and control 
forces C-(o — T-Oj), where C is the control matrix and Oj 
are hardware injections at the level of the interferometer 
reference set-point, named controller guidance inputs. T 
contains possible delays in the application of the thruster 
and the electrostatic suspension actuation. Finally, the 
interferometer readouts o are related to the physical co- 
ordinates q through the sensing matrix S and corrupted 
by the readout noise o n . 

A straightforward procedure, described in [18] , is capa- 
ble of solving the problem of computing the out-of-loop 
external force per unit mass by subtracting the effect of 
the controller. 

A mono-dimensional model for LPF along the sensi- 
tive axis was previously described [5]. The dynamics is 
characterized by only two degrees of freedom and the two 
interferometer readings are the o\ channel (the relative 
displacement of the optical bench to the reference TM) 
and the Oi 2 or differential channel (the relative displace- 
ment of the second TM to the reference TM). In the 
hypothesis of small motion, the vectors and matrices can 
be written as 



D 




h - he + fl + fsc 
-fi + h ~ /sc->i + /sc 



m 2 



C sus (s) ' V e V ' V^i 1 



(A4) 



1.96 kg and msc — 422.7 kg are the 



where mi ~ to 2 
three body masses, ui'f and = uj'f + w 2 2 are para- 
sitic stiffness constants which model oscillator-like resid- 
ual force couplings between each TM and the SC, mostly 
coming from gravitational, electrostatic and magnetic ef- 
fects. T x ~ 4.9xl0" 9 s" 2 models the gravitational coupling 
between the TMs. /1, / 2 , /sc are external forces on the 
first TM, the second TM and the SC; fsc-n, /sc-+2 are 
coupling forces on the first and second TM by the SC; 
/2->i is a residual coupling force between the two TMs. 
Cdf(s) and C sus (s) are the controller/actuator laws along 
the sensitive axis for the drag-free and suspension loops 
commanding the SC and second TM to follow the refer- 
ence TM; A^f and A slls are two gains for the application 
of the actuation of the thrusters and the electrostatic sus- 
pension forces. Aii and Ai 2 are two delays in the pre- 
vious actuation. S21 is the lower off-diagonal element of 
S which models the sensing cross-talk from the o\ to oi 2 
channel arising from imperfect common-mode rejection. 



Appendix B: Demonstration of noise 
non-stationarity 



We want to demonstrate that the variation of any 
of the parameters with respect to time produces non- 
stationary noise. Hence we check the validity of Eq. ( 16 ) 
in Section [IV A| Expanding the noise around some nomi- 
nal parameter value po up to first order and by computing 
the variance of the noise, we get 

Var[n] ~ Var [no] + Var [n'Sp] + 2Cov [no, n'Sp] 

= Var [no] + Var [n'j Sp 2 + 2Cov [n , n'] Sp , 

(Bl) 



where Var [n'] and Cov[no,n'] are the variance of the 
noise first derivative and the covariance between the zero 
order and the first derivative. Now, for a zero-mean pro- 
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cess with finite second moment, it holds 

Cov [n ,n'] = E [n n'] - E [n ] E [n'\ 



Substituting this result back into Eq. (Bl I, we finally 



come to Eq. ( 16 1 



E 

Id 
2dp 



1 d 



- 

2 dp 
Varfnl 



(B2) 
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